Histogram Domain Ordering for Path Selectivity Estimation
نویسندگان
چکیده
We aim to improve the accuracy of path selectivity estimation in graph databases by intelligently ordering the domain of a histogram used for estimation. This problem has not, to our knowledge, received adequate attention in the research community. We present a novel framework for the systematic study of path ordering strategies in histogram construction and use. In this framework, we introduce new ordering strategies which we experimentally demonstrate lead to significant improvement of the accuracy of path selectivity estimation over current strategies. These positive results highlight the fundamental role that domain ordering plays in the design of effective histograms for efficient and scalable graph query processing.
منابع مشابه
Bloom Histogram: Path Selectivity Estimation for XML Data with Updates
Cost-based XML query optimization calls for accurate estimation of the selectivity of path expressions. Some other interactive and internet applications can also benefit from such estimations. While there are a number of estimation techniques proposed in the literature, almost none of them has any guarantee on the estimation accuracy within a given space limit. In addition, most of them assume ...
متن کاملCXHist : An On-line Classification-Based Histogram for XML String Selectivity Estimation
Query optimization in IBM’s System RX, the first truly relational-XML hybrid data management system, requires accurate selectivity estimation of path-value pairs, i.e., the number of nodes in the XML tree reachable by a given path with the given text value. Previous techniques have been inadequate, because they have focused mainly on the tag-labeled paths (tree structure) of the XML data. For m...
متن کاملCost Estimation Techniques for Database Systems
This dissertation is about developing advanced selectivity and cost estimation techniques for query optimization in database systems. It addresses the following three issues related to current trends in database research: estimating the cost of spatial selections, building histograms without looking at data, and estimating the selectivity of XML path expressions. The first part of this disserta...
متن کاملQuery Selectivity Estimation Based on Improved V-optimal Histogram by Introducing Information about Distribution of Boundaries of Range Query Conditions
Selectivity estimation is a parameter used by a query optimizer for early estimation of the size of data that satisfies query condition. Selectivity is calculated using an estimator of distribution of attribute values of attribute involved in a processed query condition. Histograms built on attributes values from a database may be such representation of the distribution. The paper introduces a ...
متن کاملSummarizing Spatial Relations - A Hybrid Histogram
Summarizing topological relations is fundamental to many spatial applications including spatial query optimization. In this paper, we examine the selectivity estimation for range window query to summarize the four important topological relations: contains, contained, overlap, and disjoint. We propose a novel hybrid histogrammethod which uses the concept of Min-skew partition in conjunction with...
متن کامل